 mean estimation


Mean Testing under Truncation beyond Gaussian

arXiv.org Machine Learning

We characterize the fundamental limits of high-dimensional mean testing under arbitrary truncation, where samples are drawn from the conditional distribution $P(\cdot \mid S)$ for an unknown truncation set $S$ that may hide up to an $\varepsilon$-fraction of the probability mass. For distributions with $p$-th directional moments of magnitude at most $\nu_{P,p}$, truncation induces a bias of order $O(\nu_{P,p}\varepsilon^{1-1/p})$. This bias creates a sharp information-theoretic detectability floor: when the signal $\alpha$ falls below this threshold, the null and alternative hypotheses are indistinguishable even with infinite data. Above this floor, we prove that a simple second-order test achieves near-optimal sample complexity $n = O\!\left(\frac{\|\Sigma_P\|}{(\alpha-4\nu_{P,p}\varepsilon^{1-1/p})^2}\sqrt{d}\right)$. We further identify a structural escape from this finite-moment bias barrier. Under a directional median regularity assumption, truncation bias improves to linear order $O(\varepsilon)$. This reveals an intermediate regime in which estimation requires $\Theta(d)$ samples for uniform recovery, while testing recovers the classical $\Theta(\sqrt d)$ rate once truncation bias is eliminated. Together, our results provide a unified framework for mean testing under truncation, connecting finite-moment, sub-Gaussian, and median-regular structural regimes.
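
To make the quantities in this abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It evaluates the bias scale $\nu_{P,p}\varepsilon^{1-1/p}$ and the sample-size expression with the hidden $O(\cdot)$ constants set to 1 (an assumption), and computes a standard second-order U-statistic that estimates the squared norm of the mean. The function names are hypothetical, and the abstract does not specify the exact test statistic, so the U-statistic is only one plausible reading of "second-order test".

```python
import math


def truncation_bias(nu, eps, p):
    """Bias scale nu * eps**(1 - 1/p); the constant factor is omitted (assumption)."""
    return nu * eps ** (1.0 - 1.0 / p)


def sample_size_scale(sigma_norm, alpha, nu, eps, p, d):
    """Evaluate ||Sigma|| * sqrt(d) / (alpha - 4 * bias)**2 with the O(.) constant set to 1."""
    gap = alpha - 4.0 * truncation_bias(nu, eps, p)
    if gap <= 0:
        # The expression is vacuous when the signal does not exceed 4 * bias.
        return math.inf
    return sigma_norm * math.sqrt(d) / gap ** 2


def second_order_statistic(samples):
    """Average of <X_i, X_j> over all ordered pairs i != j.

    For i.i.d. samples this is an unbiased estimate of ||mean||^2.
    Uses the identity sum_{i != j} <X_i, X_j> = ||sum_i X_i||^2 - sum_i ||X_i||^2.
    """
    n = len(samples)
    d = len(samples[0])
    total = [sum(x[k] for x in samples) for k in range(d)]
    sum_sq_norms = sum(sum(v * v for v in x) for x in samples)
    cross = sum(t * t for t in total) - sum_sq_norms
    return cross / (n * (n - 1))
```

For example, with $p = 2$, $\nu_{P,p} = 1$ and $\varepsilon = 0.01$ the bias scale is $0.1$, so a signal of $\alpha = 0.3$ sits below $4 \times 0.1$ and `sample_size_scale` returns infinity, while $\alpha = 0.5$ gives a finite value.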


Instance-optimal Mean Estimation Under Differential Privacy

Neural Information Processing Systems

Mean estimation under differential privacy is a fundamental problem, but worst-case optimal mechanisms do not offer meaningful utility guarantees in practice when the global sensitivity is very large. Instead, various heuristics have been proposed to reduce the error on real-world data that do not resemble the worst-case instance. This paper takes a principled approach, yielding a mechanism that is instance-optimal in a strong sense. In addition to its theoretical optimality, the mechanism is also simple and practical, and adapts to a variety of data characteristics without the need for parameter tuning. It easily extends to the local and shuffle models as well.



Learning with User-Level Privacy

Neural Information Processing Systems

We propose and analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints. Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution ($m \geq 1$ samples), providing more stringent but more realistic protection against information leaks. We show that for high-dimensional mean estimation, empirical risk minimization with smooth losses, stochastic convex optimization, and learning hypothesis classes with finite metric entropy, the privacy cost decreases as $O(1/\sqrt{m})$ as users provide more samples.




Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

Neural Information Processing Systems

Informally, given $n \gtrsim d/\alpha^2$ samples from such a distribution with mean $\mu$ and covariance $\Sigma$, our estimators output $\tilde{\mu}$ such that $\|\tilde{\mu} - \mu\|_{\Sigma} \leq \alpha$, where $\|\cdot\|_{\Sigma}$ is the Mahalanobis distance. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require $\Omega(d^{3/2})$ samples. Each of our estimators is based on a simple, general approach to designing differentially private mechanisms, but with novel technical steps to make the estimator private and sample-efficient. Our first estimator samples a point with approximately maximum Tukey depth using the exponential mechanism, but restricted to the set of points of large Tukey depth. Proving that this mechanism is private requires a novel analysis. Our second estimator perturbs the empirical mean of the data set with noise calibrated to the empirical covariance, without releasing the covariance itself. Its sample complexity guarantees hold more generally for subgaussian distributions, albeit with a slightly worse dependence on the privacy parameter. For both estimators, careful preprocessing of the data is required to satisfy differential privacy.
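
As a rough illustration of the error metric and of what "noise calibrated to the empirical covariance" can look like, here is a small Python sketch under stated assumptions. It computes the Mahalanobis norm $\sqrt{v^\top \Sigma^{-1} v}$ through a Cholesky factor and adds Gaussian noise whose covariance is a multiple of a given covariance estimate. The function names and the `noise_scale` parameter are hypothetical. The sketch is not the paper's estimator and is not differentially private on its own: the abstract states that careful preprocessing is required, and that step is omitted here.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with a = L L^T, for a symmetric positive-definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def mahalanobis(v, sigma):
    """Mahalanobis norm sqrt(v^T sigma^{-1} v), computed as ||L^{-1} v||_2."""
    low = cholesky(sigma)
    z = []
    for i in range(len(v)):
        # Forward substitution: solve L z = v one coordinate at a time.
        s = sum(low[i][k] * z[k] for k in range(i))
        z.append((v[i] - s) / low[i][i])
    return math.sqrt(sum(t * t for t in z))


def covariance_shaped_noise(mean, sigma_hat, noise_scale, rng):
    """Return mean + noise_scale * L g with g ~ N(0, I) and sigma_hat = L L^T.

    The added noise has covariance noise_scale**2 * sigma_hat.
    Illustration only: without the preprocessing the paper calls for,
    this step alone gives no privacy guarantee.
    """
    low = cholesky(sigma_hat)
    g = [rng.gauss(0.0, 1.0) for _ in mean]
    return [
        m + noise_scale * sum(low[i][k] * g[k] for k in range(i + 1))
        for i, m in enumerate(mean)
    ]
```

A call such as `covariance_shaped_noise(mean, sigma_hat, 0.1, random.Random(0))` returns a perturbed mean, and `mahalanobis` applied to the difference between that output and the true mean gives the error in the norm used by the guarantee above.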



Outlier-Robust Sparse Estimation via Non-Convex Optimization

Neural Information Processing Systems

We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems such that any approximate stationary point of the associated optimization problem yields a near-optimal solution for the underlying robust estimation task. As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks. The obtained algorithms are simple, practical, and succeed under broader distributional assumptions compared to prior work.


Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

Neural Information Processing Systems

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean µ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates µ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability δ, having an additive log(1/δ) dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.